Towards XML Mining: The Role of Kernel Methods
Abstract
XML mining is a unique application of data mining in that it deals with structured XML content. This introductory paper provides a brief but comprehensive review of the milestones towards XML mining. XML mining did not emerge overnight or by chance; it is the accumulated result of a continuous evolution from data mining through text mining and web mining. Furthermore, the paper envisages applications of kernel methods to XML mining. Preliminary results from a schema-matching simulation indicate that kernel methods for structured data are an adequate tool for XML mining.
Keywords: kernel methods, schema matching, structured data, text mining, XML mining
Similar resources
A Novel Approach to Measuring Structural Similarity between XML Documents
Measuring structural similarity between XML documents has become a key component in various applications, including XML mining, schema matching, and web service discovery. This paper presents a novel structural similarity measure that applies kernel methods to XML documents. Preliminary simulation results show that this approach outperforms conventional ones.
Kernels for Semi-Structured Data
Semi-structured data such as XML and HTML is attracting considerable attention. It is important to develop various kinds of data mining techniques that can handle semi-structured data. In this paper, we discuss applications of kernel methods to semi-structured data. We model semi-structured data as labeled ordered trees, and present kernels for classifying labeled ordered trees based on their ta...
Kernel Methods and Visualization for Interval Data Mining
We propose using kernel methods and visualization tools for mining interval data. When large datasets are aggregated into smaller ones, we need more complex data tables, e.g., interval-type tables instead of standard ones. Our investigation aims at extending kernel methods to interval data analysis and using graphical tools to explain the obtained results. The user deeply understands the models’ beh...
Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica
Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...
Utilizing the Structure and Data Information for XML Document Clustering
This paper reports on the experiments and results of a clustering approach used in the INEX 2008 Document Mining Challenge. The clustering approach utilizes both the structure and the content information of the XML documents in the Wikipedia collection. The content of the XML documents is measured using the latent semantic kernel (LSK). A well-known problem with the construction of latent seman...